Abstract

An algorithm and code to produce sequences whose members obey a
Gaussian distribution function are reported. A discrete and limited
number of groups is defined in the distribution function, where each
group is represented by only one value instead of a range of values.
The produced sequences are also checked to see whether they still fit
the discrete distribution function. Increasing the number of particles $N$
increases the value of the correlation coefficient $R^2$, but increasing
the number of groups $M$ reduces it. The value $R^2 = 1$ can be found for
$N = 1000000$ at least with $M = 5000$ and for $M = 10$ at least with
$N = 1000$.
1 Introduction

The Gaussian distribution function plays an important role in many fields of science,
such as in mathematical modeling pj, in physical sciences [2], in quantum 



chemistry [5], with its integral in nuclear physics [1], and in semiconductor
devices [2]. A need then arises for a way to produce a sequence whose members
obey the Gaussian distribution function, since such a sequence is needed, for
example, in molecular dynamics simulations [6]. A procedure to produce the
sequences is presented as an algorithm and as C++ code.

2 Gaussian distribution function 

The Gaussian or normal distribution function can be represented in the form of

$$ f(z) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(z-\mu)^2}{2\sigma^2}\right], \tag{1} $$

where $\mu$ is the average of $z$ and $\sigma$ is the width of the normal
distribution curve. The factor in front of the right side of Equation (1) is
due to the normalization of the integral of $f(z)$,

$$ \int_{-\infty}^{\infty} f(z)\,dz = 1. \tag{2} $$

The variable $z$ is a certain parameter that obeys the Gaussian distribution
function; it can be particle velocity, particle diameter, or particle mass.

2.1 Proof of normalization 

Equation (2) can be proved using
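One standard route is sketched here. It assumes that the distribution has the usual form $f(z) = \frac{1}{\sigma\sqrt{2\pi}}\exp[-(z-\mu)^2/(2\sigma^2)]$ and uses the substitution $u = (z-\mu)/(\sigma\sqrt{2})$ together with the Gaussian integral $\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$. The symbol $u$ is introduced only for this illustration, and the steps are not necessarily the ones the authors had in mind.

```latex
\begin{align*}
\int_{-\infty}^{\infty} f(z)\,dz
  &= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty}
     \exp\left[-\frac{(z-\mu)^2}{2\sigma^2}\right] dz \\
  &= \frac{1}{\sigma\sqrt{2\pi}} \, \sigma\sqrt{2}
     \int_{-\infty}^{\infty} e^{-u^2}\,du
     \qquad \left(u = \frac{z-\mu}{\sigma\sqrt{2}},\; dz = \sigma\sqrt{2}\,du\right) \\
  &= \frac{1}{\sqrt{\pi}} \, \sqrt{\pi} = 1 .
\end{align*}
```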
2.2 Meaning of $\mu$ and $\sigma$

The peak of $f(z)$ is located at $z = \mu$ with value

$$ f_{\max}(z) = f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}, \tag{4} $$

and at $z = \mu \pm \frac{1}{2}\sigma$ it gives

$$ f\left(\mu \pm \frac{\sigma}{2}\right) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{8}\right). \tag{5} $$

An example of $f(z)$ is given in Figure 1.

Figure 1: A Gaussian distribution function with $\mu = 0.5$ and $\sigma = 0.25/\sqrt{2\pi}$.



2.3 Number of particles 

Suppose that there are $N$ particles in a system; the number of particles
$N(z)$ that have property $z$ is defined by

$$ N(z) = \frac{N}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(z-\mu)^2}{2\sigma^2}\right], \tag{6} $$

where according to Equation (2) it must hold that

$$ \int N(z)\,dz = N. \tag{7} $$

In this case, it is considered that property $z$ has only positive values.

3 Discretization of distribution function 

It is impossible, even with today's most advanced computer facilities, to
produce a continuous number of particles on the order of one mole, which equals
about $10^{23}$ particles. In this report only a small number of particles is
considered. The distribution function is also simplified by dividing it into
limited and discrete groups of particles. Within each group there is only one
value (a certain property of the particle) which represents the group instead
of a range of values from the minimum to the maximum value of the group.

3.1 Discrete groups

Suppose that there are $M$ groups of particles with equal width $\Delta z$, which
group the total number of particles $N$. The first step is to find $z_{\min}$ and
$z_{\max}$, where at these values $N(z)$ can be considered zero. Since we deal with
particles, it is simpler to use the int() function, which returns the integer
value of $N(z)$. It means that $f(z)$ is considered zero when $N(z) = 1 - \epsilon$, then

$$ N(z) = 1 - \epsilon, \quad z < \mu \;\Rightarrow\; z = z_{\min}, \tag{8} $$

$$ N(z) = 1 - \epsilon, \quad z > \mu \;\Rightarrow\; z = z_{\max}, \tag{9} $$

with $\epsilon$ a small defined value. Then the width $\Delta z$ can be found through

$$ \Delta z = \frac{z_{\max} - z_{\min}}{M}. \tag{10} $$

Group $i$ is represented by $z_i$, which is

$$ z_i = z_{\min} + \left(i - \frac{1}{2}\right) \Delta z, \quad i = 1, 2, \ldots, M - 1, M. \tag{11} $$
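The group values can be illustrated with a short C++ sketch. It assumes that $N(z)$ is the Gaussian scaled by $N$, so that the condition $N(z) = 1 - \epsilon$ can be solved in closed form instead of with a root finder; that shortcut is a swap-in made here, not necessarily what the authors do. The names `Groups` and `make_groups` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical container for the discretization of Equations (8)-(11).
struct Groups {
    double zmin, zmax, dz;
    std::vector<double> z;   // z[i-1] holds z_i
};

// Sketch only: solves N(z) = 1 - eps analytically for a Gaussian N(z).
Groups make_groups(double mu, double sigma, double N, int M, double eps) {
    assert(sigma > 0 && M > 0 && eps > 0 && eps < 1);
    const double pi = std::acos(-1.0);
    // Peak height N(mu); it must exceed the threshold 1 - eps.
    const double peak = N / (sigma * std::sqrt(2.0 * pi));
    assert(peak > 1.0 - eps);
    // |z - mu| at which N(z) has dropped to 1 - eps.
    const double half = sigma * std::sqrt(2.0 * std::log(peak / (1.0 - eps)));
    Groups g;
    g.zmin = mu - half;
    g.zmax = mu + half;
    g.dz = (g.zmax - g.zmin) / M;              // group width
    for (int i = 1; i <= M; ++i)
        g.z.push_back(g.zmin + (i - 0.5) * g.dz);  // group centre z_i
    return g;
}
```

For example, `make_groups(0.5, 0.1, 1000.0, 10, 1e-3)` returns ten group values placed symmetrically around `mu`.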



3.2 Members of each group

As declared previously, in group $i$ there is only one value of $z$,
which is $z_i$. This is only for the sake of simplicity. Each group has a number
of particles that must obey the Gaussian distribution function. The number of
particles in each group is

$$ N_i = \left(\frac{N}{N'}\right) \mathrm{int}[N(z_i)]. \tag{12} $$

Since there is a round-down process (through the int() function) for each
group in order to find $N_i$ from $N(z_i)$, it can be concluded that

$$ \sum_{i=1}^{M} N_i < N, \tag{13} $$

a difference that makes the discrete groups of particles deviate from the
Gaussian distribution function. The factor in front of the right side of
Equation (12) is due to the discrete number of particle groups.

3.3 Algorithm to group the particles 

An algorithm implementing Equations (8) - (12) can be summarized
as follows

1. start
2. determine mu and sigma for distribution function N(z)
3. determine epsilon
4. set z = mu
5. using root finding algorithm find root of N(z) - (1 - epsilon)
   = 0 in range z < mu, it is named as zmin
6. set z = mu
7. using root finding algorithm find root of N(z) - (1 - epsilon)
   = 0 in range z > mu, it is named as zmax
8. determine number of groups M
9. calculate group width dz using Equation (10)
10. determine zi using Equation (11) for all M groups
11. determine number of particles Ni in group i using Equation (12)
12. calculate N' and normalize Ni with it
13. stop
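The steps above can be sketched in C++ as follows. This is a minimal sketch, not the authors' program. Bisection is used as the root finder because the text does not name one (gaussg in Appendix A instead steps away from mu in small increments), and the bracket of 40 sigma is an assumption. The names `gauss_N`, `bisect`, and `group_particles` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Gaussian N(z) scaled by the number of particles N.
double gauss_N(double mu, double sigma, double N, double z) {
    const double pi = std::acos(-1.0);
    return N / (sigma * std::sqrt(2.0 * pi))
         * std::exp(-(z - mu) * (z - mu) / (2.0 * sigma * sigma));
}

// Bisection for N(z) = target; N(lo) and N(hi) must lie on opposite
// sides of target. lo may be larger than hi.
double bisect(double mu, double sigma, double N, double target,
              double lo, double hi) {
    double flo = gauss_N(mu, sigma, N, lo) - target;
    for (int it = 0; it < 200; ++it) {
        double mid = 0.5 * (lo + hi);
        double fmid = gauss_N(mu, sigma, N, mid) - target;
        if ((fmid > 0) == (flo > 0)) { lo = mid; flo = fmid; }
        else                         { hi = mid; }
    }
    return 0.5 * (lo + hi);
}

// Steps 2-12: returns the normalized counts N_i; zi receives the z_i.
std::vector<int> group_particles(double mu, double sigma, int N, int M,
                                 double eps, std::vector<double>& zi) {
    assert(sigma > 0 && N > 0 && M > 0);
    const double target = 1.0 - eps;
    assert(gauss_N(mu, sigma, N, mu) > target);
    // Steps 4-7: roots on both sides of mu (assumed bracket: 40 sigma).
    double zmin = bisect(mu, sigma, N, target, mu, mu - 40.0 * sigma);
    double zmax = bisect(mu, sigma, N, target, mu, mu + 40.0 * sigma);
    // Steps 9-11: group width, group values, truncated counts.
    double dz = (zmax - zmin) / M;
    zi.assign(M, 0.0);
    std::vector<int> Ni(M);
    long Nprime = 0;
    for (int i = 0; i < M; ++i) {
        zi[i] = zmin + (i + 0.5) * dz;
        Ni[i] = static_cast<int>(gauss_N(mu, sigma, N, zi[i]));
        Nprime += Ni[i];
    }
    assert(Nprime > 0);
    // Step 12: normalize with N / N' and truncate again.
    for (int i = 0; i < M; ++i)
        Ni[i] = static_cast<int>(Ni[i] * (static_cast<double>(N) / Nprime));
    return Ni;
}
```

Because of the second truncation, the counts returned by `group_particles` sum to at most `N`.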

4 The sequences 

In group $i$ there are $N_i$ particles which have a property $z_i$. The property
can be velocity, mass, diameter, charge, or another physical property. And
there are $M$ groups of particles. It means that when all the particles are
lined up to make sequences, there will be $S$ ways to rearrange the order of
the particles. If the particles are distinguishable

$$ S_{\text{distinguishable}} = N!, \tag{14} $$

and when they are indistinguishable

$$ S_{\text{indistinguishable}} = \frac{N!}{\prod_{i=1}^{M} N_i!}. \tag{15} $$

The latter means that particles in the same group are identical, which means
the particles are identified only by their property $z_i$.
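The number of sequences grows too quickly to store as an integer, so a sketch can work with its logarithm. The example below assumes that the indistinguishable case is the multinomial count, the factorial of the total number of particles divided by the product of the factorials of the group sizes. The name `log_sequence_count` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Natural logarithm of (sum of N_i)! / (N_1! N_2! ... N_M!),
// evaluated with lgamma to avoid overflow.
double log_sequence_count(const std::vector<int>& Ni) {
    long total = 0;
    double log_s = 0.0;
    for (int n : Ni) {
        assert(n >= 0);
        total += n;
        log_s -= std::lgamma(n + 1.0);   // subtract ln(N_i!)
    }
    return log_s + std::lgamma(total + 1.0);  // add ln(N!)
}
```

For group sizes {2, 1} the exponential of the returned value is 3 up to rounding, matching the three arrangements of two identical particles and one different particle.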

4.1 The zeroth sequence 

The easiest way to build the sequence is by lining up the particles from each
group in incremental order, such as

$$ z_1, z_1, z_2, z_2, z_2, z_2, z_3, z_3, \ldots, z_M, z_M. \tag{16} $$

This sequence is named the zeroth sequence.
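Building the zeroth sequence amounts to repeating each group value as many times as the group has particles. A minimal sketch, with the hypothetical name `zeroth_sequence`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Repeat z[i] exactly Ni[i] times, in increasing group order.
std::vector<double> zeroth_sequence(const std::vector<double>& z,
                                    const std::vector<int>& Ni) {
    assert(z.size() == Ni.size());
    std::vector<double> seq;
    for (std::size_t i = 0; i < z.size(); ++i) {
        assert(Ni[i] >= 0);
        seq.insert(seq.end(), static_cast<std::size_t>(Ni[i]), z[i]);
    }
    return seq;
}
```

With group values {0.1, 0.2, 0.3} and counts {2, 4, 2} this gives 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.3, 0.3.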



4.2 Other sequences 

The sequences other than the zeroth sequence can be generated by permuting the
zeroth sequence. The number of sequences that can be produced is given by
Equations (14) and (15). In this report we propose a mechanism to generate a
sequence from the zeroth sequence by using the random() and swap() functions,
which are already available in C++. The algorithm is as follows

1. start
2. determine seed for random generator
3. set the generator with the seed
4. get the zeroth sequence that contains N particles
5. particle number i = 1
6. generate an integer number between 1 and N, say j
7. swap value of particle i and j
8. increase value of i by 1
9. if i still less than or equal to N go to Step 6
10. stop

Since the random numbers generated by the C++ random generator depend on the
seed, the sequence is reproducible. It means that the seed acts as an
identifier of the sequence.
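The swap procedure can be sketched as below. This is an illustration under assumptions: `std::mt19937` and `std::uniform_int_distribution` stand in for random(), so the sequences it produces will differ from those of gausss for the same seed, and the name `permute_sequence` is hypothetical. Every particle is swapped once with a uniformly chosen partner, as in the steps above.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Swap each element once with a randomly chosen element.
// The same seed gives the same permutation on a given C++ library.
std::vector<double> permute_sequence(std::vector<double> seq, unsigned seed) {
    assert(!seq.empty());
    std::mt19937 gen(seed);                      // seeded generator
    std::uniform_int_distribution<std::size_t> pick(0, seq.size() - 1);
    for (std::size_t i = 0; i < seq.size(); ++i) {
        std::size_t j = pick(gen);               // partner index in [0, N-1]
        std::swap(seq[i], seq[j]);
    }
    return seq;
}
```

The result contains the same values as the input, only in a different order, so the distribution over groups is unchanged.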

5 Error 

The sum of the generated values of $N_i$ over all groups $i$ in a sequence will
be less than the total number of particles, as given by Equation (13), which
means an error. This error can be calculated using a common correlation
coefficient $R^2$ formulation

$$ R^2 = 1 - \frac{SS_{\text{err}}}{SS_{\text{tot}}}, \tag{17} $$

where

$$ SS_{\text{err}} = \sum_i [N_i - N(z_i)]^2, \tag{18} $$

$$ SS_{\text{tot}} = \sum_i (N_i - \bar{N})^2, \tag{19} $$

$$ \bar{N} = \frac{1}{N'} \sum_i N_i z_i, \tag{20} $$

where $N'$ is the total number of generated particles

$$ N' = \sum_{i=1}^{M} N_i. \tag{21} $$

Equations (17) - (21) will be used in the next section to calculate the error
in the produced sequences.
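A minimal sketch of the correlation coefficient is given below. It uses the textbook definition, in which the total sum of squares is taken around the arithmetic mean of the observed values. That is an assumption and may differ from the reference value computed in gaussg (Appendix A), which uses an average weighted by $z_i$. The name `r_squared` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// R^2 = 1 - SS_err / SS_tot for observed counts against model values.
double r_squared(const std::vector<double>& observed,
                 const std::vector<double>& model) {
    assert(observed.size() == model.size() && !observed.empty());
    double mean = 0.0;
    for (double v : observed) mean += v;
    mean /= static_cast<double>(observed.size());  // textbook mean (assumption)
    double ss_err = 0.0, ss_tot = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        ss_err += (observed[i] - model[i]) * (observed[i] - model[i]);
        ss_tot += (observed[i] - mean) * (observed[i] - mean);
    }
    assert(ss_tot > 0.0);   // undefined if all observed values are equal
    return 1.0 - ss_err / ss_tot;
}
```

A perfect fit gives 1, and a model that only reproduces the mean of the observations gives 0.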



6 Results and discussion

An illustration of two discrete Gaussian distribution functions is given in
Figure 2, which is produced by our program gaussg. It has been found that
the value $N'$ shown in Equation (12) cannot be used in the continuous function
to fit the discrete values. The new fitting function is then

$$ N_d(z) = \frac{N_d}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(z-\mu)^2}{2\sigma^2}\right], \tag{22} $$

where

$$ N_d = \frac{N N'}{\sum_{i=1}^{M} \mathrm{int}[N(z_i)]}. \tag{23} $$

This is the quantity computed as NN3 in gaussg (Appendix A). The correlation
coefficient in Equation (17) is calculated using $N_d(z_i)$ instead of $N(z_i)$.

Variation of the number of particles $N$ and the number of groups $M$ is also
observed, as illustrated in Figure 3 and Figure 4, respectively. It can be seen
that larger $N$ gives better $R^2$ and larger $M$ gives worse $R^2$. The number of
groups should be more than or equal to $N/M$ for the program gaussg to handle it.

Figure 2: Example of discrete values of the Gaussian distribution function
generated by gaussg with $\mu = 0.5$ for $\sigma = 0.1$, $N' = 994$, $N_d = 109.665$
(solid line and square marks) and $\sigma = 0.04$, $N' = 998$, $N_d = 45.3348$
(dashed line and circle marks).

The next results are the sequences produced from $N_d(z_i)$, as shown
in Figure 5. Only the first four seeds are used to generate four sequences.
These sequences have the same distribution function, which has $\mu = 0.5$,
$\sigma = 0.1$, $N = 100$, and $M = 10$. These results are produced by the
program gausss.



7 Conclusion 



Two programs, gaussg for creating discrete groups and gausss for creating
sequences, have been developed and tested. The discrete Gaussian distribution
function can be produced. The sequences, which have the same distribution
function, can also be generated. Further investigation is needed on how to
register all available sequences for a distribution function. As $N$
increases the value of $R^2$ approaches 1, but as $M$ increases the value of
$R^2$ decreases below 1. $R^2 = 1$ can be achieved with larger $N$ and smaller
$M$. The discrete Gaussian distribution function has a different constant from
the continuous distribution function that is used to generate the discrete
and limited groups.

Figure 3: Dependence of correlation coefficient $R^2$ on number of particles $N$
for $\mu = 0.5$, $\sigma = 0.1$, and $M = 10$.
Appendix A: gaussg



/*
    gaussg.cpp
    Generate discrete groups of Gaussian distribution function
    Authors are Sparisoma Viridi and Veinardi Suendo
    Version date is 2011.07.17
*/

#include <iostream>
#include <fstream>
#include <vector>
#include <stdlib.h>
#include <math.h>

const double PI = 3.14159265;

using namespace std;

// Gaussian N(z) scaled by the number of particles N
double Nz(double mu, double sigma, double N, double z);

int main(int argc, char **argv) {
    if(argc < 6) {
        cout << "Version date is 2011.07.17" << endl;
        cout << "gaussg is written by Sparisoma Viridi ";
        cout << "and Veinardi Suendo" << endl;
        cout << "Generate discrete groups of particles ";
        cout << "that obey Gaussian distribution ";
        cout << "function" << endl;
        cout << endl;
        cout << "Usage: gaussg mu sigma N M output-file" << endl;
        cout << endl;
        cout << "All arguments are mandatory:" << endl;
        cout << "mu           average of Gaussian ";
        cout << "distribution function" << endl;
        cout << "sigma        width of Gaussian ";
        cout << "distribution function" << endl;
        cout << "N            number of particles" << endl;
        cout << "M            number of groups" << endl;
        cout << "output-file  output file" << endl;
    } else {
        double mu = atof(argv[1]);
        double sigma = atof(argv[2]);
        int N = atoi(argv[3]);
        int M = atoi(argv[4]);
        const char *ofn = argv[5];

        cout << "mu = " << mu << endl;
        cout << "sigma = " << sigma << endl;
        cout << "N = " << N << endl;
        cout << "M = " << M << endl;
        cout << "output-file = " << ofn << endl;

        // Step away from mu until N(z) drops below eps.
        // The step is proportional to mu, so mu must be positive.
        double eps = 1E-3;
        double dz = mu * 1E-5;
        double NNz = N;

        double zmin = mu;
        while(NNz > eps) {
            NNz = Nz(mu, sigma, N, zmin);
            zmin -= dz;
        }
        cout << "zmin = " << zmin << endl;

        NNz = N;
        double zmax = mu;
        while(NNz > eps) {
            NNz = Nz(mu, sigma, N, zmax);
            zmax += dz;
        }
        cout << "zmax = " << zmax << endl;

        // Group width
        dz = (zmax - zmin) / M;
        cout << "dz = " << dz << endl;

        // Group values zi and truncated counts int[N(zi)]
        vector<double> zi(M);
        vector<int> Ni(M);
        double NN = 0;
        for(int i = 0; i < M; i++) {
            zi[i] = zmin + (i + 0.5) * dz;
            double z = zi[i];
            Ni[i] = (int) Nz(mu, sigma, N, z);
            NN += Ni[i];
        }
        cout << "N' = " << NN << endl;

        // Normalize the counts with N / N' and truncate again
        double NN2 = 0;
        for(int i = 0; i < M; i++) {
            Ni[i] = (int)(Ni[i] * (N/NN));
            // cout << i << "\t";
            // cout << Ni[i] << endl;
            NN2 += Ni[i];
        }
        cout << "N\" = " << NN2 << endl;

        // Write the groups and the fitting function values
        vector<double> Nzi(M);
        double NN3 = 0;
        ofstream fout;
        fout.open(ofn);
        fout << "#i\tzi\tNi\tN(zi)" << endl;
        for(int i = 0; i < M; i++) {
            fout << i + 1 << "\t";
            fout << zi[i] << "\t";
            fout << Ni[i] << "\t";
            double z = zi[i];
            NN3 = 1.0 * N * NN2 / NN;
            Nzi[i] = Nz(mu, sigma, NN3, z);
            fout << Nzi[i] << endl;
        }
        fout.close();
        cout << "Nt = " << NN3 << endl;

        // Average of zi weighted by Ni, used as reference in SStot
        double SNi = 0;
        for(int i = 0; i < M; i++) {
            SNi += (Ni[i] * zi[i]);
        }
        double mui = SNi / NN2;

        double SStot = 0;
        double SSerr = 0;
        for(int i = 0; i < M; i++) {
            double dSStot = (Ni[i] - mui) * (Ni[i] - mui);
            SStot += dSStot;
            double dSSerr = (Ni[i] - Nzi[i]) * (Ni[i] - Nzi[i]);
            SSerr += dSSerr;
        }

        double R2 = 1 - SSerr/SStot;
        cout << "R^2 = " << R2 << endl;
    }
    return 0;
}

double Nz(double mu, double sigma, double N, double z) {
    double c1 = N / (sigma * sqrt(2 * PI));
    double c2 = exp(-(z - mu)*(z - mu) / (2 * sigma * sigma));
    double c3 = c1 * c2;
    return c3;
}


Appendix B: gausss 



/*
    gausss.cpp
    Generate sequences from discrete groups of Gaussian
    distribution function
    Authors are Sparisoma Viridi and Veinardi Suendo
    Version date is 2011.07.17
*/

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
#include <stdlib.h>
#include <math.h>

const double PI = 3.14159265;

using namespace std;

int main(int argc, char **argv) {
    if(argc < 4) {
        cout << "Version date is 2011.07.17" << endl;
        cout << "gausss is written by Sparisoma Viridi ";
        cout << "and Veinardi Suendo" << endl;
        cout << "Generate sequences from discrete groups ";
        cout << "of particles that\nobey Gaussian ";
        cout << "distribution function" << endl;
        cout << endl;
        cout << "Usage: gausss seed input-file output-file" << endl;
        cout << endl;
        cout << "All arguments are mandatory:" << endl;
        cout << "seed         seed for random generator ";
        cout << "(1, 2, ..)";
        cout << endl;
        cout << "input-file   input file" << endl;
        cout << "output-file  output file" << endl;
    } else {
        long int seed = atoi(argv[1]);
        const char *ifn = argv[2];
        const char *ofn = argv[3];

        cout << "seed = " << seed << endl;
        cout << "input-file = " << ifn << endl;
        cout << "output-file = " << ofn << endl;

        // Read the groups written by gaussg: i, zi, Ni, N(zi)
        ifstream fin;
        fin.open(ifn);
        string buf;
        // Skip the four header tokens "#i zi Ni N(zi)"
        fin >> buf; fin >> buf; fin >> buf; fin >> buf;
        vector<double> zi;
        vector<int> Nzi;
        int k, n;
        double z, nz;
        while(fin >> k >> z >> n >> nz) {
            zi.push_back(z);
            Nzi.push_back(n);
        }
        fin.close();
        int M = (int) zi.size();

        // Round zi to three decimals and count the particles
        int N = 0;
        for(int l = 0; l < M; l++) {
            zi[l] = 0.001 * round(zi[l] * 1000);
            N += Nzi[l];
        }

        // Zeroth sequence: groups lined up in incremental order
        vector<double> seq0(N);
        k = 0;
        for(int m = 0; m < M; m++) {
            for(int l = Nzi[m]; l > 0; l--) {
                seq0[k] = zi[m];
                k++;
            }
        }

        vector<double> seq1(N);
        for(int n = 0; n < N; n++) {
            seq1[n] = seq0[n];
        }

        // Swap every particle with a randomly chosen one
        srandom(seed);
        for(int n = 0; n < N; n++) {
            long int a1 = random();
            double a2 = 1.0 * a1 / RAND_MAX;
            int a3 = (int) (N * a2);
            if(a3 >= N) a3 = N - 1; // guard the case a2 == 1
            swap(seq1[n], seq1[a3]);
        }

        ofstream fout;
        fout.open(ofn);
        for(int n = 0; n < N; n++) {
            fout << seq1[n] << endl;
        }
        fout.close();
    }
    return 0;
}
Figure 5: Sequences with seed: (a) 1, (b) 2, (c) 3, and (d) 4.



